Recurrent rearrangements of chromosome 1q21.1 that occur via non-allelic homologous recombination have
been associated with variable phenotypes exhibiting incomplete penetrance, including congenital heart
disease (CHD). However, the gene or genes within the ~1 Mb critical region responsible for each of the
associated phenotypes remain unknown. We examined the 1q21.1 locus in 948 patients with tetralogy of
Fallot (TOF), 1488 patients with other forms of CHD and 6760 ethnically matched controls using single
nucleotide polymorphism genotyping arrays (Illumina 660W and Affymetrix 6.0) and multiplex
ligation-dependent probe amplification. We found that duplication of 1q21.1 was more common in cases of
TOF than in controls [odds ratio (OR) 30.9 (95% confidence interval (CI) 8.9-107.6); P = 2.2 × 10^-7],
but deletion was not. In contrast, deletion of 1q21.1 was more common in cases of non-TOF CHD than in
controls [OR 5.5 (95% CI 1.4-22.0); P = 0.04] while duplication was not. We also detected rare (n = 3)
100-200 kb duplications within the critical region of 1q21.1 in cases of TOF. These small duplications
encompassed a single gene in common, GJA5, and were enriched in cases of TOF in comparison to controls
[OR = 10.7 (95% CI 1.8-64.3), P = 0.01]. These findings show that duplication and deletion at chromosome
1q21.1 exhibit a degree of phenotypic specificity in CHD, and implicate GJA5 as the gene responsible for
the CHD phenotypes observed with copy number imbalances at this locus.



INTRODUCTION 

Congenital heart disease (CHD) is the most common congenital anomaly, with an incidence of ~7 per 1000
live births (1). CHD may occur as part of recognized chromosomal and Mendelian syndromes (2-4), but in
80% of cases it manifests as a non-syndromic, non-Mendelian condition. Significant familial recurrence
risk has been demonstrated in such 'sporadic' CHD cases, indicating complex genetic contributions to the
risk of most CHD (5,6). The most common cyanotic form of CHD is tetralogy of Fallot (TOF), which occurs
in ~1 in 2500 live births (1). TOF is characterized by a ventricular septal defect, over-riding of the
aortic valve due to anterocephalad deviation of the outlet septum, right ventricular outflow tract
obstruction and right ventricular hypertrophy.

Recently, rearrangement hotspots in the human genome that occur via non-allelic homologous recombination
(NAHR) and result in recurrent copy number imbalances have been identified (7,8). One such locus is
situated at 1q21.1; copy number variants (CNVs) at this locus have been associated with variable
phenotypes exhibiting incomplete penetrance (9-12), including both syndromic (9-11) and non-syndromic
(isolated) (11,13) forms of CHD. A recent study by Greenway et al. (13) identified 1q21.1 copy number
imbalances in 5 out of 512 isolated TOF cases. Although these findings were highly statistically
significant when compared with controls (P = 0.0002), they have yet to be replicated for TOF. Moreover,
the gene or genes responsible for TOF risk at this locus remain unidentified among the 11 RefSeq genes
in the ~1 Mb critical region of 1q21.1. Of these genes, gap junction protein α-5 (GJA5, Connexin40,
Cx40) has previously been proposed as the candidate gene for several cardiac disease phenotypes,
including CHD (14-17). Both Gja5 heterozygous (18%) and homozygous-null (33%) mice exhibit complex heart
defects, including conotruncal and endocardial cushion defects (14). However, no GJA5 point mutation or
GJA5-specific CNV has been found in CHD patients to date (9,13). We thus examined the 1q21.1 locus in a
large cohort of isolated CHD patients (n = 2436) in order to estimate more precisely the contribution
of 1q21.1 rearrangements to CHD risk, and to identify the causative gene for CHD at this locus.

RESULTS 

CNV analyses and concordance between methods 
of detection 

We performed genome-wide CNV analysis on a total of 807 TOF cases and 841 controls using a combination
of the PennCNV (18) and QuantiSNP (19) CNV-calling algorithms on data generated using Illumina 660W-Quad
chips and found one duplication locus that reached statistical significance for enrichment in TOF versus
controls at 1q21.1 (7/807 versus 0/841; P = 0.007). A subset of the same TOF patients (n = 198) was also
analyzed with the Birdseye algorithm on data generated using Affymetrix 6.0 chips. Additionally, we
performed targeted multiplex ligation-dependent probe amplification (MLPA) analysis on the 1q21.1 locus
in 570 TOF probands, 429 of which were also typed on Illumina 660W, yielding a total of 948 TOF cases
examined in this study. A total of 1488 non-TOF CHD cases with clinical diagnoses as shown in Table 1
were analyzed using the PennCNV and QuantiSNP algorithms on Illumina 660W-Quad data; MLPA on the 1q21.1
locus was also performed on 428 of these individuals. When data from the four methods of detection in
all cases were merged, we found 100% concordance in CNVs detected in the 1q21.1 locus between the
methods used in this study. All CNVs that were detected from the Illumina 660W arrays in individuals
that had not also been analyzed with another independent method (n = 8) were successfully validated
with MLPA.

Table 1. Non-TOF CHD phenotype distribution

Cardiac malformation                                          n

Ventricular septal defect                                     162
Atrial septal defect                                          290
Atrioventricular septal defect                                63
Transposition of the great arteries                           173
Congenitally corrected transposition of the great arteries    37
Common arterial trunk                                         24
Double-outlet left ventricle                                  1
PA with ventricular septal defect                             18
Double-outlet right ventricle                                 18
Pulmonary stenosis                                            79
PA with intact ventricular septum                             18
Aortic valve abnormalities                                    131
Hypoplastic left heart syndrome                               15
Mitral valve abnormalities                                    23
Double inlet left ventricle or right ventricle                23
Ebstein malformation                                          14
Tricuspid valve abnormalities                                 32
Aortic arch abnormalities                                     161
Patent ductus arteriosus                                      66
Partial anomalous pulmonary venous drainage                   19
Total anomalous pulmonary venous drainage                     9
Left isomerism                                                12
Right isomerism                                               11
Univentricular heart                                          14
Situs inversus/dextrocardia                                   5
Heterotaxy                                                    8
Coronary artery anomaly                                       3
Other                                                         59

Total                                                         1488

Frequency of 1q21.1 rearrangements in control
populations

We examined the frequency of NAHR-mediated 1q21.1 rearrangements in 841 individuals from a French
population cohort, 5919 WTCCC2 control individuals and 4150 control individuals from previously
published studies (8,20,21) that used high-density single nucleotide polymorphism (SNP) platforms
comparable to those used in this study. Three duplications and four deletions in 10910 controls were
observed (see Supplementary Material, Table S1).
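
As an arithmetic check on the cohort sizes quoted in the Results, the short Python sketch below recomputes the TOF case total by inclusion-exclusion and the control totals by summation. The variable names are illustrative only and are not taken from the study's own analysis scripts.

```python
# Counts quoted in the Results; variable names are illustrative only.
tof_array = 807   # TOF cases analysed on Illumina 660W-Quad after QC
tof_mlpa = 570    # TOF probands analysed with MLPA
tof_both = 429    # MLPA probands that were also typed on the array

# Inclusion-exclusion: cases examined by either method, counted once.
tof_total = tof_array + tof_mlpa - tof_both

controls = {"French cohort": 841, "WTCCC2": 5919, "published studies": 4150}
genotyped_controls = controls["French cohort"] + controls["WTCCC2"]
all_controls = sum(controls.values())

# Carrier frequency of NAHR-mediated events per 10,000 controls.
dup_per_10k = 3 / all_controls * 10_000
del_per_10k = 4 / all_controls * 10_000

print(tof_total, genotyped_controls, all_controls)  # 948 6760 10910
print(round(dup_per_10k, 2), round(del_per_10k, 2))  # 2.75 3.67
```

The totals agree with the 948 TOF cases, 6760 genotyped controls and 10910 pooled controls used in the association tests.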

Microduplication of 1q21.1 is strongly associated
with sporadic, isolated TOF

We identified NAHR-mediated duplications of 1q21.1 that span the previously reported critical region
(9,10) in eight unrelated TOF probands (Fig. 1 and Table 2). The duplications were found to be de novo
in one proband, inherited from an unaffected mother in three probands and of unknown inheritance (due
to unavailability of parental samples) in the remaining four probands. There were no occurrences of
1q21.1 deletion in our TOF cohort. Thus, our findings showed that 1q21.1 microduplications are strongly
associated with TOF [8/948 versus 3/10910; P = 2.2 × 10^-7; odds ratio (OR) = 30.9, 95% confidence
interval (CI) = 8.9-107.6; Table 3], with population attributable risk (PAR) = 0.92%. In contrast, we
found no evidence that microdeletions of 1q21.1 are associated with TOF (0/948 versus 4/10910; Table 3).
Additional phenotypic details of all probands with 1q21.1 CNVs are presented in Supplementary Material,
Table S2.

Figure 1. The 1q21.1 region (adaptation from the UCSC hg18 Genome Browser) and the summary of findings in TOF and non-TOF mixed CHD cohort. (A) The
region of 1q21.1 is complex (143.5-147.5 Mb is shown) due to the presence of extensive segmental duplication blocks and the existing gaps in the reference
human genome sequence (NCBI Build 36.1). The largest pair of segmental duplications with >99% homology that mediates most of the rearrangements in this
locus is indicated by large orange arrows. (B) RefSeq genes in the region. The critical region is indicated by translucent gray block. (C) The coverage of the
Illumina 660W-Quad (the main platform used in this study) and the location of the custom-designed MLPA probes used in this study are shown. (D) Overview of
1q21.1 duplications (blue bars) and deletions (red bars) identified in our study. The cardiac phenotype is shown after the patient identifier. TGA, transposition of
the great arteries; MV, mitral valve dysplasia with ventricular septal defect; ASD, atrial septal defect.

GJA5 duplication is associated with TOF 

In addition to NAHR-mediated events, we found 100-200 kb rare duplications within the critical region
of 1q21.1 in three patients with TOF. All of these duplications encompass GJA5, a strong candidate gene
for CHD (14-17). We did not find any deletion within the critical region of 1q21.1 in our TOF cohort.
This smaller duplication variant occurred in 2/6760 controls. Thus, small duplications encompassing the
GJA5 gene were also enriched in our TOF cohort compared with controls (P = 0.01; OR = 10.7, 95% CI =
1.8-64.3). Additionally, we found a GJA5 triplication in one patient with pulmonary atresia (PA), like
TOF, a cardiac outflow tract phenotype, in the non-TOF CHD cohort (Fig. 2).

Microdeletion of 1q21.1 is associated with isolated
congenital heart defects

Examination of the 1q21.1 locus in 1488 cases with other forms of isolated CHD (mixed, non-TOF) revealed
three NAHR-mediated deletions and no duplications that spanned the entire critical region (Fig. 1 and
Table 2). Thus, 1q21.1 deletion was associated with isolated non-TOF CHD (3/1488 versus 4/10910;
P = 0.04; OR = 5.5, 95% CI = 1.4-23.2, with PAR = 0.17%; Table 3). In contrast, we found no evidence of
association between 1q21.1 duplication and non-TOF CHD (0/1488 versus 3/10910; Table 3). The CHD
phenotypes of the cases with 1q21.1 deletion were transposition of the great arteries, atrial septal
defect and mitral valve dysplasia with ventricular septal defect (Supplementary Material, Table S2).



DISCUSSION 

In 948 isolated TOF cases, we found strong association between duplication at chromosome 1q21.1 and
disease risk. We found no association between 1q21.1 deletion and TOF risk. In contrast, among 1488
cases of other isolated CHD phenotypes, we detected association between deletion, rather than
duplication, at 1q21.1 and disease risk. Our findings therefore indicate associations between
duplication or deletion at this region and CHD that are to a degree specific for particular CHD
phenotypes, a novel observation. Furthermore, in the present study, we detected overlapping rare
duplication variants of 100-200 kb in size within the critical region of 1q21.1 that are also enriched
(P = 0.01) in our TOF cohort. These small duplication variants encompass only a single gene in common,
i.e. GJA5, thus identifying GJA5 as a critical CHD gene in this locus.

Table 2. Summary of 1q21.1 CNVs in 2436 CHD patients

Chr  Start         Length (kb)  CN  Patient ID    Parental DNA  Inheritance  Phenotype  Illumina660  Affy 6.0  MLPA
                                                  availability                          QS    PC
1    144106312     1742         3   CHA-937.1     -             n/a          TOF        Y     Y      n/a       Y
1    144943150     1350         3   GOCHD-982.1   -             n/a          TOF        Y     Y      n/a       Y
1    144943150     1350         3   NOTT-107.1    P+M           inh-m        TOF        Y     Y      n/a       Y
1    144943150     1350         3   CHA-102.1     -             n/a          TOF        Y     Y      n/a       Y
1    144967972     1325         3   CHA-137.1     P+M           dn           TOF        Y     Y      Y         Y
1    144967972     1325         3   CHA-363.1     P+M           inh-m        TOF        Y     Y      n/a       Y
1    144967972(a)  1321         3   CHA-574.1     M             inh-m        TOF        n/a   n/a    n/a       Y
1    144967972     880          3   CHA-867.1     M             n/a          TOF/PA     Y     Y      n/a       Y
1    145594226     254          3   LEU-30.1      P+M           inh-m        TOF        Y     Y      n/a       Y
1    145594226     254          3   LEU-98.1      -             n/a          TOF        Y     Y      n/a       Y
1    145658466     118          3   CHA-620.1     P+M           inh-m        TOF        Y     Y      Y         Y
1    145658465     118          4   NOTT-319.1    -             n/a          PA         Y     Y      n/a       Y
1    144967972     1419         1   SYD-1499.1    -             n/a          TGA        Y     Y      n/a       Y
1    144967972     1325         1   FCH-397.1     -             n/a          ASD        Y     Y      n/a       Y
1    144967972     1325         1   NOTT-674.1    -             n/a          MV/VSD     Y     Y      n/a       Y

Chr, chromosome; CN, copy number; QS, QuantiSNP; PC, PennCNV; Affy, Affymetrix; Y, yes; n/a, not available; dn, de novo; inh-m, inherited from the mother;
P, paternal; M, maternal; PA, pulmonary atresia; ASD, atrial septal defect; MV/VSD, mitral valve dysplasia with ventricular septal defect; TGA, transposition of
the great arteries.

(a) The proband was typed on the Illumina 660W array but failed SNP QC (low call rate) and was thus excluded from the array analyses. However, the mother of the
proband was also typed on the Illumina 660W array and passed QC. DNAs from both the proband and mother were analyzed with MLPA, which showed full
1q21.1 duplications with the same breakpoints. This also confirmed the array data from the mother, which passed QC. Thus, the coordinates listed here were
inferred from the mother who transmitted the duplication to the respective proband.

Table 3. Phenotypic specificity of 1q21.1 rearrangements in isolated CHD

                      1q21.1 microduplications               1q21.1 microdeletions
Patient cohort        n   P-value       OR (95% CI)          n   P-value   OR (95% CI)
TOF (n = 948)         8   2.2 × 10^-7   30.9 (10.2-119.0)    0   NS        -
Non-TOF (n = 1488)    0   NS            -                    3   0.04      5.5 (1.4-22.0)

NS, not significant.

Chromosome 1q21.1 deletion was first proposed as a cause for CHD by Christiansen et al. (11), who found
deletions that span the entire critical region of 1q21.1 in one syndromic and two non-syndromic CHD
cases among 505 patients referred for clinical genetic evaluation of suspected DiGeorge or Williams'
syndrome. All three of the deletion-carrying probands had obstruction of the aortic arch as part of
their phenotype. However, the specificity of that phenotypic association was likely to have been
heavily influenced by selection bias, since aortic arch interruption and supravalvular aortic stenosis
are classic cardiovascular manifestations of DiGeorge and Williams' syndromes, respectively. More
recently, deletion of 1q21.1 was shown to be present more frequently in patients with variable
phenotypes, who had been referred to diagnostic centres principally for mental retardation accompanied
by other features, than in controls (n = 25/5218 patients; 0/4737 controls; P = 1.1 × 10^-7) (9).
Twelve of the 25 deletion carriers had CHD as a feature. A second study of 16 557 patients referred to
a clinical cytogenetics laboratory who were examined by array comparative genomic hybridization for a
range of abnormalities revealed 21 probands with microdeletion and 15 with microduplication. However,
only 1 of these 36 patients had CHD without other strong environmental predisposing factors, a patient
with a duplication who had a complex heart defect (10). No previous study has estimated the frequency
of 1q21.1 rearrangements in patients with mixed CHD phenotypes ascertained on the basis of their heart
disease, rather than on the basis of suspected syndromic features; our results demonstrate a modest
excess of 1q21.1 deletion in such patients, and no evidence of association with 1q21.1 duplication. In
the case of sporadic, isolated TOF, Greenway et al. (13) found four 1q21.1 duplications and one 1q21.1
deletion in 512 cases and no occurrence in 2265 controls (P = 0.007 and P = 0.18 for 1q21.1 duplication
and deletion, respectively). Our results confirm that CNV at 1q21.1 is strongly associated with
isolated TOF, and in a cohort almost twice as large as that investigated by Greenway et al. (13), we
demonstrate that duplication is much more strongly associated with TOF than is deletion, for which we
found no evidence of association. Interestingly, specificity of 1q21.1 copy number imbalances has been
previously described in other associated phenotypes: 1q21.1 duplications but not deletions have been
found to be associated with macrocephaly and autism spectrum disorders, while 1q21.1 deletions but not
duplications were found to be associated with microcephaly and schizophrenia (10,22).

Figure 2. Small duplications encompassing GJA5 within the critical region of 1q21.1. Five CNVs were identified within the critical region of 1q21.1 in 948 TOF
and 1488 non-TOF CHD cases. Rare duplications of 100-200 kb in size (shown as blue bars) were found in 3/948 TOF cases encompassing a single gene in
common: GJA5. Probands LEU-30 and LEU-98 were found to be distantly related, with estimated genome-wide probabilities of sharing (0, 1, 2)
alleles identical by descent (IBD) of (0.8581, 0.1369, 0.0050). However, their estimated IBD-sharing probabilities within the 3 Mb region surrounding GJA5 are considerably higher
(0, 0.64, 0.36), and both of them carry duplications with identical breakpoints. Thus, these two observations are likely to represent one ancestral duplication
event. Appropriate correction for the distant relatedness of these two individuals was made in the statistical analyses (see Supplementary Material). In 1488
non-TOF CHD cases, a triplication variant (blue bar) encompassing GJA5 was found in one patient with PA and one deletion variant (red bar) encompassing
the last exon of a non-coding LOC100289211 gene was found in one patient with transposition of the great arteries (TGA). RefSeq genes in the region, the
coverage for the Illumina 660W platform and location of the MLPA probes are indicated in the hg18 UCSC Genome Browser (http://genome.ucsc.edu).



It has been speculated that the 1q21.1 region harbours a single causal gene critical for both
cardiovascular and brain development that accounts for both aspects of the rearrangement phenotype, but
previous studies had not been able to establish this (9,13). In our TOF cohort, we identified rare
smaller duplications within the critical region of 1q21.1, all of which encompass GJA5, the strongest
candidate gene for the CHD phenotype in this locus. These overlapping small GJA5 duplications are rare
(3/948) in comparison to the NAHR-mediated duplications (9/948) found in TOF cases, but we also found
them to be significantly enriched in TOF when compared with controls (P = 0.01). With the exception of
one patient with PA (a cardiac outflow defect like TOF) who has a GJA5 triplication, we found no such
variant in our non-TOF CHD cohort. Our results therefore suggest duplication of GJA5 as the most likely
mechanism responsible for the association of NAHR-mediated duplication at 1q21.1 and TOF risk. It is
not possible to infer directly from our data that GJA5 deletion is responsible for the association of
NAHR-mediated deletion at 1q21.1 and the risk of other forms of isolated CHD, although this seems
likely.

GJA5 encodes the cardiac gap junction protein connexin-40, which has key functions in cell adhesion and
cell-cell communication. Mice with genetically engineered deletion of Gja5 have a variety of complex
congenital cardiac malformations, in particular of the cardiac outflow tract (14). There are as yet no
data from animal models of Gja5 overexpression, although such data would be of evident interest.
However, mice overexpressing Gja1 (Cx43), another connexin isotype, have outflow tract defects (23,24).
While the original report of the Gja5 knockout mouse suggested that Gja5 was not expressed in neural
crest cells in the mouse, more recent work has disputed this finding (14,25). The second heart field
plays a critical role in the development of the cardiac outflow tract, and mutations in genes expressed
in the second heart field result in outflow tract defects both in mouse models and in humans. Gja5 was
recently shown to be expressed in cells derived from the second heart field during outflow tract
development, where it is regulated by the key cardiac transcription factor Hand2 (25). TOF patients are
highly prone to atrial and ventricular arrhythmias in later life, which represent a significant source
of morbidity and mortality. A number of lines of evidence implicate dysregulation of GJA5 in atrial
arrhythmogenesis (26-29). It would be of interest to determine whether there is differential
susceptibility to atrial arrhythmia in TOF patients with and without duplication at 1q21.1 involving
GJA5.

Our data do not exclude the possibility that other genes in the 1q21.1 region also contribute to CHD
risk. Among the possible other candidate genes, CHD1L has been shown to be overexpressed in patients
with TOF, double-outlet right ventricle and infundibular pulmonary stenosis compared with controls
(30). PRKAB2, which encodes the β2 subunit of adenosine monophosphate-activated protein kinase, is
highly expressed in the right ventricular outflow tract; mutations in PRKAG2, a γ2 subunit of the same
protein, have been found to cause some familial forms of hypertrophic cardiomyopathy (31). However,
among a total of 2436 CHD cases, we did not find any smaller CNVs within the 1q21.1 region that
implicated any gene other than GJA5, suggesting that any contribution of such CNVs to CHD risk, while
not excluded by our findings, is of small magnitude. Although our findings at GJA5 are statistically
significant (P = 0.01), do not require correction for multiple comparisons and are highly biologically
plausible, replication of this result in a similarly large and ethnically homogeneous population of TOF
patients will be of importance in due course.

As in previous studies, we observed marked variable penetrance of 1q21.1 CNVs. The reasons for this
observation remain uncertain. A double-hit model has been previously proposed to explain this variable
expressivity (32). However, our power to robustly identify such 'second hits' in the small number of
cases (n = 15) carrying 1q21.1 CNVs in this study is low. Additionally, in all five TOF cases where we
observed that the duplication was transmitted from an unaffected parent, transmission was maternal
(P = 0.06). Although this finding is not significant, it is possible to speculate that parent-of-origin
effects could in part explain the marked variability in penetrance of cardiac defects with
rearrangements in 1q21 that has been observed in several previous studies. A larger study comparing the
phenotype when the duplication is paternally or maternally transmitted would be required to address
this.
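
The text does not state which test produced P = 0.06 for the five maternal transmissions. One reading that reproduces the figure is a two-sided exact binomial test against equal maternal and paternal transmission; the Python sketch below works through that assumption and is not necessarily the authors' calculation.

```python
from math import comb

def binom_two_sided(k, n, p=0.5):
    """Two-sided exact binomial P-value for k successes in n trials.

    Sums the probabilities of all outcomes that are no more likely
    than the observed one under the null success probability p.
    """
    pmf = [comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(n + 1)]
    return sum(q for q in pmf if q <= pmf[k] * (1 + 1e-9))

# Five transmissions from an unaffected parent, all five maternal.
p_value = binom_two_sided(5, 5)
print(p_value)  # 0.0625
```

With only five informative transmissions, the smallest two-sided P-value attainable under this test is 0.0625, which is why a larger study is needed to evaluate a parent-of-origin effect.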

In summary, our study has defined the relationships between 1q21.1 duplication and isolated TOF, and
between the reciprocal 1q21.1 deletion and other forms of CHD. Duplication confers an OR for TOF of 31,
and accounts for ~1% of the PAR of TOF, whereas deletion confers an OR and PAR for non-TOF CHD of 6 and
0.2%, respectively. Within the 1q21.1 region, we showed that duplication of GJA5 alone is associated
with an ~10-fold increase in the risk of TOF, identifying GJA5 as a critical gene for human heart
malformation.



MATERIALS AND METHODS 

Study subjects and sample collections 

Ethical approval was obtained from the local institutional review boards at each of the participating
centres. Informed consent was obtained from all participants (or from a parent/guardian, if the patient
was a child too young to consent themselves). Individuals with TOF of Caucasian ancestry together with
their parents (when available) were recruited from multiple centres in Newcastle, Leeds, Bristol,
Liverpool, Oxford, Nottingham, Leicester (all UK), Leuven (Belgium), Erlangen (Germany) and Sydney
(Australia). Individuals with other forms of CHD of Caucasian ancestry were also recruited in
Newcastle, Oxford, Nottingham, Leicester, Leuven and Sydney. Patients that exhibited clinical features
of syndromic developmental abnormalities or learning difficulties were excluded from the study. Both
adult and child patients were recruited. CHD diagnoses and the presence of any associated phenotypes
were obtained from the medical record. Parents of cases not reporting a diagnosis of CHD did not
undergo echocardiography. All DNAs were further screened for DiGeorge and Williams abnormalities and
those found with such anomalies were excluded from the study. Control subjects consisted of 856
individuals of Caucasian ancestry free of reported CHD from a French population cohort and 6000
individuals of Caucasian ancestry free of reported CHD from the Wellcome Trust Case-Control Consortium
2 (WTCCC2) control cohort (3000 individuals from the 1958 British Birth Cohort and 3000 individuals
from the UK Blood Service). Controls did not undergo echocardiography. DNAs from cases were extracted
from blood or saliva. DNAs from the French population cohort and UK Blood Service were blood-derived
and DNAs from the 1958 British Birth Cohort were cell line derived.

Genotyping and quality-control criteria 

Genotyping for all CHD probands and the French control cohort was performed at the Centre National de
Genotypage (Evry Cedex, France). Normalized intensity and genotype data were obtained from the
genotyping centre. TOF patients and family members (907 affected offspring and 747 unaffected parents),
1987 non-TOF CHD patients and 856 unrelated individuals from a French control cohort were genotyped on
the Illumina 660W-Quad platform. Additionally, 206 of the same TOF individuals were also typed on
Affymetrix 6.0 arrays. Per-sample SNP QC analyses were performed to exclude any duplicates, individuals
with non-Caucasian ancestry and gender mismatches. Samples with low call rate (<98.8%) and high
heterozygosity were excluded. Additional intensity quality-control parameters were applied to minimize
heterogeneity due to multi-centre variation in DNA source and treatment. Samples were excluded when
they failed the following QC criteria: (i) a standard deviation of autosomal log R ratio (LRR) values
>3.0; (ii) GC wave factor of the LRR outside the range of -0.1 < X < 0.1 (18); (iii) a standard
deviation of B allele frequency values >0.15 after GC correction (19). Following these QC measures, 807
TOF patients and 697 of their family members (8 affected and 689 unaffected), 1488 patients with
non-TOF CHD and 841 control individuals from the French cohort were included in the final analyses. The
phenotype classification summary of non-TOF CHD individuals can be found in Table 1. WTCCC2 controls
were typed on Affymetrix 6.0 arrays. Further details on genotyping and QC criteria (n = 5919 passed QC)
can be found in a previously published study (33) (http://www.wtccc.org.uk/ccc2).
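
The per-sample exclusion rules listed above can be expressed as a small filter. The following Python sketch is illustrative only: the `SampleQC` record and the `qc_failures` function are hypothetical names, the thresholds are copied as printed in the text, and the checks for duplicates, ancestry, gender mismatch and heterozygosity are omitted because the text gives no numeric criteria for them.

```python
from dataclasses import dataclass

@dataclass
class SampleQC:
    """Hypothetical per-sample summary; field names are assumptions."""
    call_rate: float       # fraction of SNPs called (0 to 1)
    lrr_sd: float          # SD of autosomal log R ratio values
    gc_wave_factor: float  # GC wave factor of the LRR
    baf_sd: float          # SD of B allele frequency after GC correction

# Thresholds as printed in the Methods text.
CALL_RATE_MIN = 0.988
LRR_SD_MAX = 3.0
GC_WAVE_ABS_MAX = 0.1
BAF_SD_MAX = 0.15

def qc_failures(sample):
    """Return the list of criteria a sample fails (empty list = pass)."""
    reasons = []
    if sample.call_rate < CALL_RATE_MIN:
        reasons.append("low call rate")
    if sample.lrr_sd > LRR_SD_MAX:
        reasons.append("LRR SD too high")
    if not (-GC_WAVE_ABS_MAX < sample.gc_wave_factor < GC_WAVE_ABS_MAX):
        reasons.append("GC wave factor out of range")
    if sample.baf_sd > BAF_SD_MAX:
        reasons.append("BAF SD too high")
    return reasons

# Made-up values for illustration, not data from the study.
good = SampleQC(call_rate=0.995, lrr_sd=0.2, gc_wave_factor=0.01, baf_sd=0.04)
bad = SampleQC(call_rate=0.970, lrr_sd=0.2, gc_wave_factor=0.15, baf_sd=0.04)
print(qc_failures(good))  # []
print(qc_failures(bad))   # ['low call rate', 'GC wave factor out of range']
```

Returning the list of failed criteria, rather than a single pass/fail flag, makes it easy to tabulate why samples were dropped at each centre.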

CNV calling algorithms and verification criteria 

CNV detection on the Illumina 660W platform was performed using both PennCNV (18) and QuantiSNP (19)
CNV-calling algorithms. CNV detection on the Affymetrix 6.0 platform was performed using the Birdseye
algorithm from the Birdsuite (34) package. The numbers of markers used to call CNV within the ~1 Mb
minimal region of NAHR-mediated events in the 1q21.1 locus on the Illumina 660W and Affymetrix 6.0
platforms are 235 and 640, respectively. Within the minimal region (~100 kb) of small duplications that
encompass GJA5, 36 and 104 markers were available in the Illumina 660W and Affymetrix 6.0 platforms,
respectively. CNV calls within the 1q21.1 locus that appeared to be artificially split as the result of
coverage gaps on the SNP platforms were examined manually and joined (see Supplementary Material,
Figs S1 and S2). All CNV calls with Bayes factor >50 were confirmed with MLPA and Affymetrix 6.0 array
(19). CNV coordinates were based on NCBI build 36.1 (hg18).
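
In this study, split calls were inspected and joined manually. As a hedged illustration of the underlying idea, the Python sketch below merges adjacent calls of the same copy number when the gap between them is no larger than a chosen limit. The `Call` type, the `join_split_calls` function, the `max_gap` parameter and the example coordinates are all hypothetical and do not come from the paper.

```python
from typing import List, NamedTuple

class Call(NamedTuple):
    start: int        # first base of the call
    end: int          # last base of the call
    copy_number: int  # inferred copy number state

def join_split_calls(calls: List[Call], max_gap: int) -> List[Call]:
    """Merge neighbouring calls with equal copy number and a small gap."""
    merged: List[Call] = []
    for call in sorted(calls):  # sorts by start, then end
        prev = merged[-1] if merged else None
        if (prev is not None
                and call.copy_number == prev.copy_number
                and call.start - prev.end <= max_gap):
            # Extend the previous call across the coverage gap.
            merged[-1] = Call(prev.start, max(prev.end, call.end),
                              prev.copy_number)
        else:
            merged.append(call)
    return merged

# Made-up coordinates for illustration only.
calls = [Call(1_000_000, 1_400_000, 3),
         Call(1_450_000, 2_300_000, 3),   # 50 kb after the first call
         Call(5_000_000, 5_100_000, 1)]
print(join_split_calls(calls, max_gap=100_000))
```

A rule of this kind is best used to flag candidates for manual review against the probe coverage map, since a real gap between two independent events would also satisfy it.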

MLPA design, assay and analysis 

We performed MLPA analyses of the critical region of 1q21.1 in 574 TOF probands (433 of which were also
typed on the Affymetrix 6.0 and/or Illumina 660W arrays) and 433 non-TOF CHD probands (all were also
typed on the Illumina 660W arrays). MLPA probes were designed with MAPD (35) software, using the
default settings. The resulting list of candidate probes was subjected to BLAT (36) search in order to
ensure specificity and to obtain genomic positions. Candidate probes that overlapped known SNPs (37)
and/or segmental duplications (38) [identified by using UCSC Extended DNA utility (39,40)] were
excluded. The remaining candidate probes with the highest score given by the MAPD software that met all
the MRC Holland guidelines (http://www.mrc-holland.com) were chosen for synthesis. Twenty probes
(ranging from 100-140 bp final product size) that targeted GJA5 (10), CHD1L (2), ACP6 (2), GJA8 (2),
PRKAB2 (2), TXNIP (1) and ANKRD34A (1) were synthesized. Nine 1q21 probes were used for each MLPA assay
in addition to two control synthetic probes targeting copy number neutral regions. All probe sequences
are available upon request.

All MLPA (41) assays were performed with reagents from the MRC Holland P200 kit (Amsterdam, the
Netherlands) and custom design synthetic oligonucleotide probes (Integrated DNA Technology, IA, USA).
Genomic DNA (100 ng in 5 µl TE) was denatured for 30 min at 95°C and cooled down to 25°C before
addition of 35 fmoles of synthetic custom design probes, 1 µl of P200 probe mix and 1.5 µl of MLPA
buffer. This was followed by hybridization at 60°C for 16 h. Hybridized probes were ligated with 1 U of
Ligase-65 for 15 min at 54°C, followed by ligase deactivation for 5 min at 98°C. Afterwards, 5 µl of
the ligated products were added to 15 µl of polymerase chain reaction (PCR) buffer (2:13 dilution in
H2O) at 4°C, and the temperature was raised to 60°C before the remaining PCR reagents (2.5 nmoles of
dNTPs, 10 pmol fluorescein amidite-labelled universal primers and 2.5 U of SALSA polymerase) were added
to make the final reaction volume to 25 µl. PCR reaction was carried out in 33 cycles (95°C for 30 s,
60°C for 30 s and 72°C for 1 min), followed by a final extension at 72°C for 20 min. The final MLPA
products were subsequently resolved on ABI 3730xl (Applied Biosystems, CA, USA), using the MRC Holland
protocol (http://www.mrc-holland.com). The data were analyzed with the GeneMarker v. 1.85 software
(SoftGenetics, PA, USA), in which population normalization was applied, and the peak areas were used to
calculate relative dosage.
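
The text does not spell out how relative dosage was derived from peak areas. A common generic approach, sketched below in Python, is to scale each target peak by the mean of the control-probe peaks in the same sample and then divide by the population median of that scaled value. This is an assumption for illustration and is not a description of GeneMarker's internal algorithm; the function name, probe names and peak areas are made up.

```python
from statistics import median

def relative_dosage(samples, target, control_probes):
    """Relative dosage of one target probe for every sample.

    samples: {sample_id: {probe_name: peak_area}}
    Each target area is first divided by the mean control-probe area of
    the same sample, then by the population median of those ratios.
    """
    scaled = {}
    for sample_id, areas in samples.items():
        control_mean = sum(areas[p] for p in control_probes) / len(control_probes)
        scaled[sample_id] = areas[target] / control_mean
    population_median = median(scaled.values())
    return {sid: value / population_median for sid, value in scaled.items()}

# Made-up peak areas: four samples with two copies, one with three.
samples = {
    "s1": {"GJA5_ex1": 1000, "ctrl_a": 1000, "ctrl_b": 1000},
    "s2": {"GJA5_ex1": 2000, "ctrl_a": 2000, "ctrl_b": 2000},
    "s3": {"GJA5_ex1": 900,  "ctrl_a": 900,  "ctrl_b": 900},
    "s4": {"GJA5_ex1": 1100, "ctrl_a": 1100, "ctrl_b": 1100},
    "s5": {"GJA5_ex1": 1500, "ctrl_a": 1000, "ctrl_b": 1000},
}
dosage = relative_dosage(samples, "GJA5_ex1", ["ctrl_a", "ctrl_b"])
print(dosage["s1"], dosage["s5"])  # 1.0 1.5
```

Under this scheme a heterozygous duplication is expected near 1.5 and a heterozygous deletion near 0.5, with the cut-offs used to call them left to the analyst.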

Incorporation of control data from previously 
published studies 

In addition to 841 control individuals from the French cohort and 5919 WTCCC2 controls, we examined the
frequency of 1q21.1 rearrangements in 4150 control individuals from previously published studies
(8,20,21) that used high-density SNP platforms comparable to the platforms used in this study (with
coverage of >200 probes in the critical region of 1q21.1). For details, see Supplementary Material,
Table S1.

Statistical analysis 

Two-sided Fisher's exact tests were used to compare the frequency of NAHR-mediated 1q21.1
duplications/deletions. Stata 11 was used to compute P-values, OR and 95% CI by Cornfield
approximation. The frequency of small GJA5 duplications was compared by maximum likelihood estimation
using two binomial distributions (details can be found in Supplementary Material). PAR was calculated
using the formula: 100(P(OR - 1))/(1 + (P(OR - 1))), in which P is the proportion of the control
population with the 1q21.1 duplication/deletion. Genome-wide identity-by-descent (IBD) sharing was
calculated using 41692 autosomal SNPs in PLINK (42). IBD sharing for the 1q21.1 locus was calculated
using 268 SNPs in a roughly 3 Mb region around GJA5.
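
The tests described above can be illustrated with a minimal standard-library Python sketch. It is not the Stata procedure used in the study: the odds ratio here is the simple cross-product ratio, no Cornfield confidence interval is computed, and the function names are illustrative. For the PAR function, taking P from the pooled 10910 controls is an assumption, so its output need not coincide exactly with the reported PAR values.

```python
from math import comb

def fisher_two_sided(a, b, c, d):
    """Two-sided Fisher's exact P-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total = comb(row1 + row2, col1)

    def pmf(k):
        # Hypergeometric probability of k carriers among the row1 cases.
        return comb(row1, k) * comb(row2, col1 - k) / total

    p_obs = pmf(a)
    support = range(max(0, col1 - row2), min(col1, row1) + 1)
    # Sum every table that is no more probable than the observed one.
    return sum(p for p in map(pmf, support) if p <= p_obs * (1 + 1e-9))

def odds_ratio(a, b, c, d):
    """Cross-product odds ratio (carriers/non-carriers, cases vs controls)."""
    return (a * d) / (b * c)

def par_percent(p_control, oratio):
    """PAR (%) = 100(P(OR - 1))/(1 + (P(OR - 1)))."""
    x = p_control * (oratio - 1)
    return 100 * x / (1 + x)

# Duplications in TOF: 8/948 cases versus 3/10910 controls.
dup = (8, 948 - 8, 3, 10910 - 3)
# Deletions in non-TOF CHD: 3/1488 cases versus 4/10910 controls.
dele = (3, 1488 - 3, 4, 10910 - 4)

print(round(odds_ratio(*dup), 1))   # 30.9
print(round(odds_ratio(*dele), 1))  # 5.5
print(fisher_two_sided(*dup), fisher_two_sided(*dele))
print(par_percent(3 / 10910, odds_ratio(*dup)))
```

The cross-product odds ratios match the reported 30.9 and 5.5, and the exact P-values come out on the order of 10^-7 for duplication in TOF and close to 0.04 for deletion in non-TOF CHD.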

SUPPLEMENTARY MATERIAL 

Supplementary Material is available at HMG online.